FedMECA: Scalable Federated Learning via Memory-Efficient and Concurrent Aggregation
Zhonghao Chen (University of Florida), Duo Zhang (University of California, Merced), Xiaoyi Lu (University of Florida)
System Optimization & Efficiency
FedMECA is a federated learning aggregation system that decouples model collection from aggregation to overcome scalability failures as client counts or model sizes grow. By enabling concurrent, memory-efficient aggregation, it makes federated training viable at scales where existing systems stall.
Presentation
Talk
Paper Session 6: Learning & Control
Thursday, May 28 · 3:50 PM – 4:00 PM
Bayshore Ballroom
Poster
Thursday, May 28 · 4:30 PM – 6:00 PM
Carmel
Abstract
Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy, but it faces growing scalability issues that cause FL to fail as the number of participating clients or the model size increases. Existing aggregation paradigms usually overlook the memory and computational challenges arising from the tightly coupled processes of model collection and aggregation. Within such paradigms, aggregation must wait for all selected client updates, and the process is computationally demanding. To overcome these limitations, we propose FedMECA, a scalable, memory-efficient, and concurrency-aware aggregation framework for FL. FedMECA decouples model collection from aggregation, reducing memory pressure on the central server by 36.57× and achieving up to 238.5× speedup in aggregation runtime without compromising model accuracy or convergence speed. FedMECA is designed with minimal system complexity and supports heterogeneous clients and non-IID data. Moreover, our approach is easily extensible to aggregation strategies with different levels of synchrony, offering flexibility and adaptability across diverse FL applications. These results demonstrate that FedMECA enables scalable and efficient training for modern large-scale FL workloads.
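To make the contrast between coupled and decoupled aggregation concrete, the following is a minimal sketch, not FedMECA's actual design, which the abstract does not detail. It assumes the standard sample-weighted averaging rule (FedAvg) and compares a baseline that buffers every client update before averaging with a streaming aggregator that folds each update into a running sum as it arrives. The names `buffered_fedavg` and `StreamingAggregator` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical update format: (flattened model weights, local sample count).
Update = Tuple[List[float], int]


def buffered_fedavg(updates: Iterable[Update]) -> List[float]:
    """Coupled baseline: hold every update in memory, then average."""
    buffered = list(updates)  # memory grows with num_clients * model_size
    total = sum(n for _, n in buffered)
    dim = len(buffered[0][0])
    return [sum(w[i] * n for w, n in buffered) / total for i in range(dim)]


class StreamingAggregator:
    """Folds each update into a running weighted sum as it arrives."""

    def __init__(self, dim: int) -> None:
        self.weighted_sum = [0.0] * dim  # memory stays at one model's size
        self.total_samples = 0

    def add(self, weights: List[float], num_samples: int) -> None:
        # The update can be discarded by the caller once this returns.
        for i, w in enumerate(weights):
            self.weighted_sum[i] += w * num_samples
        self.total_samples += num_samples

    def result(self) -> List[float]:
        if self.total_samples == 0:
            raise ValueError("no updates received")
        return [s / self.total_samples for s in self.weighted_sum]


# Three toy client updates for a two-parameter model.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30), ([5.0, 6.0], 60)]

agg = StreamingAggregator(dim=2)
for weights, n in updates:
    agg.add(weights, n)
```

Under these assumptions both paths produce the same weighted average, but the streaming version keeps only one model-sized accumulator on the server instead of one buffer per client. A concurrent variant would additionally need to synchronize access to the accumulator, which this sketch leaves out.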